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I. INTRODUCTION 



The mechanical unfolding of biopolymers has been the subject of an intense research 
activity, both experimental and theoretical, in the last two decades. For a recent review, see^. 
Innovative single molecule experimental techniques, mainly based on atomic force microscopy 
(AFM) and optical tweezers, have been used to investigate the response of biopolymers to 
controlled forces, while theoretical and computational models at different levels of coarse 
graining have been proposed and investigated. 

Among the various molecules studied, fibronectin is particularly important, due to its role 
in tissue elasticity, cell adhesion and cell migration^. Its 10th type III module {Fnlllio) is 
known to be crucial for cell adhesion, through the binding of its RGD motif to transmem- 
brane integrin receptors. The secondary structure of this module consists of 2 antiparallel 
/3-sheets forming a /3-sandwich. The /3-strands are usually denoted with letters from A (the 
strand closest to the N terminal) to G (the C terminal one). The two sheets are made of 
strands ABE and DCFG, respectively, and the RGD motif is in the loop separating strands 
F and G. 




FIG. 1: Sketch of the native structure of Fnlllio (Protein Data Bank ID Ittf) with 
/9-strands labeled A-G in sequence order. Figure generated by PyMOL. 

The mechanical unfolding of FnllliQ has been studied both experimentally^*^ and by 
computer simulations^"-. Single molecule AFM experiments have shown that Fnlllio has 
a low mechanical stability, compared to other fibronectin type III domains^. Furthermore, 
AFM experiments by the same group^ showed that Fnlllio can unfold according to differ- 
ent pathways. Apparent two-state transitions were observed, as well as unfolding through 
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intermediate states. Experiments on suitable mutants suggested the possible existence of 
two different intermediate states, which is also consistent with some simulations^!^, while 
other simulations predicted simpler-*^ or more complex- scenarios. 

In the present paper we shall study the mechanical unfolding of Fnlllio by means of a 
generalized Ising-like model we have recently proposed^"—. The model has already been 
showni^iii to reproduce the general features of mechanical unfolding experiments, like the 
force dependence of the average unfolding time in a constant force protocol, or the rate de- 
pendence of the unfolding force in a constant rate protocol, together with the corresponding 
probability distributions. The same model turned out to predict the correct values for the 
unfolding lengths of a titin domain^^iii and of ubiquitin^^. Moreover, it has been used to in- 
vestigate the unfolding pathways of ubiquitin^^ and of a 236-base RNA fragment^, and the 
resulting pathways turned out to be consistent with both experimental and computational 
results, where more detailed molecular models were used. 

In the case of Fnlllio we are particularly interested in exploring the biologically relevant 
low force regime^'i^'^^, which is thought to be close to the equilibrium unfolding force and 
cannot be explored by simulations of more detailed, and more computationally expensive, 
molecular models. Our model can probe forces close to the equilibrium unfolding force, 
whose value we use to set our force unit. Such value is unfortunately not exactly known. 
Erickson^ estimates the equilibrium unfolding force to be at most 5 pN, on the basis of an 
order of magnitude calculation. On the other hand, an estimate close to 20 pN was reported 
in^. Choosing the value of 20 pN, in order to set our force unit, our results give unfolding 
forces in very good agreement with the AFM experiments (see Sec. lIVp . 

The paper is organized as follows: in Sec. [ITl we shall describe our model and simulation 
techniques; some equilibrium results are discussed in Sec. Illlj in Sec.[IV]we shall describe our 
results for the unfolding pathways, using both constant force and constant velocity protocols; 
finally, in Sec. |V]we shall draw some conclusions. 
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II. MODEL AND METHODS 



A. Model 

We represent the polypeptide chain pulled by an external force through a simple Ising- 
like modeliSi""— , that is a generalization of the Wako-Saito-Munoz-Eaton (WSME) Go-type 
model for protein folding^"—. 

In this model a (A^ + 1) residues polypeptide chain is described by binary variables 
ruk, {k = 1, . . . , N), associated to the peptide bonds. The variable nik is equal to 1 if the 
k-th bond is in a native state and if the bond is in an unfolded state. 

In the following a native stretch will indicate a sequence of consecutive amino acids 
connected by native bonds and delimited by two non-native bonds. In order to characterize 
the state of such a stretch, we introduce the quantity 

Sij = {1- mi) I JJ mfe j (1 - ruj) , (1) 
\k=i+i ) 

with (0 ^ z < j ^ + 1) and the boundary conditions mo = ?Ti7v+i = 0. Thus, a native 
stretch delimited by bonds i and j is characterised by Sij = 1, while if the sequence is not a 
native stretch, Sij = 0. In the case oi j = i + 1 the stretch corresponds to the single (z + l)-th 
residue. Therefore, given a certain configuration m = {m^} of the chain, the number M of 
native stretches is equal to M = 1 + Xlili ^ ''^i)- 

In the WSME model two amino acids can interact only if they are in contact (i.e. if they 
have at least a pair of nonhydrogen atoms closer than 4 A in the native structure deposited 
in the Protein Data Bank (PDB)) and if they belong to the same stretch. The effective 
Hamiltonian reads: 

7V-1 N j N 

H {'m,q) = ^ eijAijY\_rnk - kBT'^qi{l - rrii) , (2) 

1=1 j=i+l k=i 4=1 

where e^j is the energy gain associated to the contact between z-th and {j + l)-th amino 
acids (see next subsection for details), A is called contact matrix and its element takes 
the value A^- = 1 if such a contact exists in the native structure while = otherwise or 
if J = z + 1 (since if the amino acids are so close in the chain sequence, they have pair of 
atoms closer than 4 A even in the unfolded state). The quantity > is the entropic cost of 
ordering bond i. In the following we define hij = CijAjj. In Ref.— it has been discussed how 
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to compute exactly the partition function of the WSME model through a transfer-matrix 
formalism. 

In our generalized model we substitute the entropic term for the coupling to the force 
described by a potential energy function V depending on the end-to-end length of the protein 

N-l N j 

H{m,C) = Yl Yl h^Jl[mk + V{C) . (3) 

1=1 j=i+l k=i 

In our simulations the protein is pulled either by a constant force or at constant pulling 
velocity. In the first case the force / applied to the protein ends is constant and the potential 
energy takes the form V (C) = —fC, while in the second case the energy is time-dependent: 
y (£) = I (£o + vt- Cf, where k is a spring constant, v is the pulling velocity and Co is 
the initial equilibrium elongation. 

In order to define the length C we assume that each stretch {Sij = 1) can be only parallel 
or antiparallel to the direction of the applied force and we implement such an assumption 
through a new binary variable aij that can take the values +1 or —1 respectively. The end- 
to-end length of the protein is defined as the sum of the lengths ly of each stretch multiplied 
by aij-. 

N N+1 

C{m, c^) = X] kjCiijSij , (4) 

j=0 j=i+l 

where the dynamic variable cr is a set of M variables cTjj, each one associated to a stretch. 
Since the z-th aminoacid is represented by the sequence of its Ni nitrogen, Ca.i central 
carbon and Ci carbon atoms, the lengths Uj are obtained from the PDB structure as the 
native distance between the midpoint of the Ci_i and atoms and the midpoint of the Cj 
and Njj^i atoms. 

Dealing with the case of constant force, since the variables aij do not interact among 
themselves, it is possible to obtain an effective Hamiltonian which has the same structure 
of the Hamiltonian ([2]) of the initial model and therefore the equilibrium thermodynamics 
is exactly solvable also in this case. In fact, given the Hamiltonian: 

N~l N j N N+1 

H {m, a;f) = Y^ ^ hijYl^k- ^v^v^iJ ' (^) 

1=1 j=i+l k=i i=0 j=i+l 

we can perform the sum on the a variables in the partition function: 

Z{f) = Y,Y1 e-'^^^'"''^^^) = Yl e-^^-^'"^^) , (6) 

{m} {cr} {m} 



with 

N-l N j N N+l 

H.ff {m; /) = 5^ 5^ /i^i n rn, - - ^ ^ In [2 cosh S,, . (7) 

i=l j=i+l k=i i=0 j=i+l 

In the case of force / = the last expression reduces to eq. ([2]) with gj = In 2 for every i. 

B. Model parameters, simulation and analysis 

The parameters eij, as [^1^-22.^^^^ are taken equal to ne where n is an integer such that 
5(n — 1) < Hat ^ 5n, and Uat is the number of pairs of atoms in contact (that is, closer than 4 
A) between z-th and {j + l)-th amino acids. The temperature is set to T = 0.768 T^, where 
Tm is the equilibrium unfolding temperature at zero force. Since experimentally = 375 
we have T = 288 K. The force unit is then set in such a way that the equilibrium 
typical unfolding force at T = 288 K is 20 pN. Since an experimental measurement of this 
quantity is missing, it has been chosen on the basis of the estimates reported in^. A detailed 
discussion about the choice of energy and force scales in the model has been reported 

The nonequilibrium unfolding kinetics have been studied by Monte Carlo (MC) sim- 
ulations. More precisely, in the framework of a master equation approacb^^, we choose 
transition rates according to the Metropolis algorithm. Rigorously speaking, this choice 
cannot be derived from an underlying microscopic dynamics of the molecule. Nevertheless, 
it has been showo^"— that it reproduces many quantitative and qualitative aspects of fold- 
ing and unfolding of real molecules under an external force. A single MC step consists of 
a single-bond flip on the variable rrij, chosen with equal probability among the peptide 
bond variables, followed by a single-spin flip on the variable o"jj, also chosen with uniform 
probability among the M stretch orientational variables^. In Sec. llVt by comparing our 
estimated zero-force unfolding time with the corresponding experimental value, we shall 
find that a MC step corresponds to about 25 ns. 

Simulations have been run with nine values of the force (122 pN, 98 pN, 81 pN, 65 pN, 
53 pN, 46 pN, 40 pN, 36 pN, 28 pN) and six constant pulling velocities (0.03 fxm/s, 0.05 
fim/s, 0.1 /xm/s,0.3 fim/s, 0.5 fim/s, 1 fim/s), for each value of the force or of the velocity 
100 different unfolding trajectories have been considered. 

Each simulation stops 10^ MC steps after the protein reaches the value L„ = ^Lmax = 
\ X^ilo Kw- exception is the case / = 28 pN where we take = \Lmax-, because of the 
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larger length fluctuations and in order to prevent the trajectory ending before a complete 
unfolding event takes place. 

In order to trace unfolding pathways, we use the weighted fraction of native contacts as 
order parameter: 

~ v'-^w-iv^w h ' ^ ' 

where s is the string of bonds we are analysing and ri(s), r2(s) its first and last peptide 
units. As an example the string containing strands A and B has ri{AB) = 6 and r2{AB) = 
23. A straightforward generalization is necessary for order parameters of strings of non- 
consecutive strands (i.e. C-F and B-E). (ps turns out to be a better order parameter than 
the fraction of native bonds used in previous works because of its greater stability with 
respect to fiuctuations. When discussing the folded or unfolded character of an individual 
/3-strand, appropriate order parameters can be identified on the basis of the secondary 
structure. As an example, strand F appears in a /3-sheet between strands C and G, which 
suggests to use (f)cF and (f)pG as order parameters for strand F. 



III. EQUILIBRIUM PROPERTIES 

As mentioned before, the equilibrium thermodynamics can be solved exactly in our model 
and we can thus follow the macroscopic state behaviour of the protein at different pulling 
forces. In^i the average fraction of native bonds and end-to-end length are plotted as 
functions of the force /. 

To obtain the equilibrium energy landscape of the protein as a function of the reaction 
coordinate L we expand, foUowingiii^, the partition function ([6]) in powers of e^^ as 

L/=Liaa.x 

where Zq{L) is a zero force partition function constrained at the length value L: 

ML) = E E - ^("^' ^)) e-^^(-'-^=°) 

m a 

which, as 2{f), can be computed exactly. The corresponding free energy reads: 

Fo{L) = -kBT In Zo{L). 
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In presence of a constant force, the free energy landscape is tilted and is given by G{L) = 
Fq{L) — fL. Fig. |2] shows the landscape for various forces: at zero force there is just 
one minimum at about 3.5 nm corresponding to the folded state. By increasing the force 
three more minima appear: two of them (end-to-end lengths of about 6 and 13 nm) are 
always local minima, and will be later associated to intermediate states, while the third 
one, corresponding to the fully unfolded state, becomes the global minimum when the force 
exceeds 20 pN. 

120 I : : : : : 1 
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FIG. 2: Free energy landscape at temperature T = 288 K and for forces / = pN (red 
line), / = 20 pN (green line) and / = 28 pN (blue line). AG = G{L) - G{0). 

IV. UNFOLDING PATHWAYS 
A. Force Clamp 

In the force clamp protocol the molecule is first equilibrated in absence of force, then 
at t = the force instantaneously jumps to a non-vanishing constant value, which ranges 
between 28 to 122 pN. Notice that the forces we use are much closer to the equilibrium 
unfolding force, and hence to in vivo conditions, than most previous works, since more 
detailed models can be simulated only for very short time intervals. The smallest force 
probed by Karplus and Paci^ was 69 pN, and they did not observe any unfolding event at 
this force, while Gao et aP' used forces not smaller than 400 pN. Only in the all-atom Monte 
Carlo simulations by Mitternacht et al^ unfolding events at constant forces as small as 50 
pN could be observed. 

In Fig. |3]we report the average unfolding time as a function of force /. Three regimes 
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are clearly distinguishable. In the high force regime the unfolding time saturates to a 
constant plateau, as observed for several other proteins^. In the low force regime (25 to 60 
pN) we have made a fit to the Arrhenius' law 



Tu = To exp 



obtaining the unfolding length x„ = 3.4 ± 0.1 A, which compares well, given the extreme 
simplicity of our model, with the experimental results Xu = 3.8 A-. Comparing our zero- 
force unfolding time tq with its experimental value Tgxp = 50 s^, we find out that a single 
MC step in our model corresponds to about 25 ns. In the intermediate force regime (60 to 
115 pN) our fit yields x„ = 1.3 ± 0.1 A. 




25 50 75 100 125 150 175 200 225 250 
f(pN) 



FIG. 3: Mean unfolding time function of the force / applied to the molecule 

(average over 100 different trajectories). The red line is a fit to the Arrhenius' law in the 
range of forces from 25 to 60 pN. In this range we find from the fit Xu = 3.4 ±0.1 A. The 
green line is a fit from 60 to 115 pN, x„ = 1.3 ± 0.1 A. 



The unfolding trajectories can be grouped in four classes according to their main features, 
i.e. their end-to-end length plateaus (if they exist) and the order parameters behaviour for 
the whole molecule and its various pairs of /3-strands. 

At large forces we observe simple 2-state trajectories, while at smaller forces various 
intermediates are obtained. A scheme of the possible pathways is shown in Fig. |H 

In trajectories exhibiting intermediate states it turns out, as already pointed out in pre- 
vious papers^i^, that strand G is always the first to break away. In cellular environment such 
behaviour seems to be connected to the function of the RGD motif Arg78-Gly79-Asp80^. 
When the module is fully folded, the RGD motif is available for adhesion, while if strand G 
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is pulled and detached from the remainder of the module, the RGD motif gets closer to the 
surface of the module and is not functional. 

Strand G detachment may be rapidly followed by complete unfolding or by an intermedi- 
ate state. A possibility is that strand A detaches almost at the same time of strand G while 
the remaining part of the molecule stays folded for a certain time before complete unfolding. 
This kind of unfolding pathway will be labeled with AG, its intermediate end-to-end length 
is about 13.5 nm. It may happen that instead of strand A, strand F detaches together with 
G, such unfolding pathway (intermediate end-to-end length ~ 14 nm) will be labeled GF. 

The last possibility occurs only in the biologically relevant regime of low forces. It is 
believed^ that such relevant forces, in vivo, are of the same order of magnitude as the 
equilibrium unfolding force (~ 20 pN, see section [In|) . though forces as low as 5 pN have 
been suggested^ as typical unfolding forces. Our low force unfolding pathway is a mixture 
of the previous two: strands A and G are the first to unfold, then, before the molecule 
completely unfolds, A refolds and F unfolds. This may happen reversibly many times in 
a single trajectory with consecutive folding (unfolding) of strand A and parallel unfolding 
(folding) of strand F. Such trajectories will be labeled mixed AG-GF because the molecule 
is fluctuating between two different intermediates {AG and GF). These intermediates have 
almost the same end-to-end length, and therefore cannot be distinguished in a simple free 
energy landscape, as illustrated in Fig. [21 where a single, broad minimum is observed at 
L ^ 13 nm. 

Fig. [5] shows two typical trajectories at 65 pN constant force. Other typical trajectories 
are plotted in^. During thermalization, before turning the force on at time t = 0, the length 
of the polypeptide chain fluctuates around L = (FigJS^), since different orientations of the 
molecule are equally likely. Then, at time t = 0, a. waiting phase starts, which can be easily 
seen in FigJSb and Fig|5li. This waiting phase corresponds to a metastable state which 
is characterised by an end-to-end length ~ 3.5 nm corresponding to the elongation in the 
native state. The rise in the end-to-end length to the intermediate value is always associated 
to the drop in order parameters connected to two different pairs of strands, with the GF 
pair always involved. The order parameters which have not been plotted go to zero only 
when the protein reaches the fully elongated configuration (end-to-end length ~ 29 nm). 

In Figj6]we report a mixed AG-GF trajectory obtained at force / = 28 pN, slightly larger 
than the equilibrium unfolding force: after the long waiting phase there are two different 
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FIG. 4: Unfolding pathways scheme of Fnlllio pulled by a constant force. Transitions 
denoted by red arrows have been observed only at low forces (40, 36 and 28 pN). Oblique 

red arrows represent refolding transitions. 
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(a) Unfolding pathway: AG 
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(b) Unfolding pathway: GF 



FIG. 5: Typical MC trajectories: end-to-end length (red line) and a few order parameters 
as functions of time, with a / = 65 pN. Green line: fraction of native contacts, whole 
FnllliQ. Blue line: fraction of native contacts between strands G and F. Purple line: 

fraction of native contacts between strands A and B. Cyan line: fraction of native contacts 

between strands C and F. 
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FIG. 6: Mixed AG-GF trajectory: MC time evolution of the end-to-end length (red line) 
and of some order parameters with a constant force of 28 pN. Colors as in Fig. [5] 



intermediate states before the complete unfolding. Despite the large fluctuations in the 
order parameters associated to the pairs C-F and A-B, it is still possible to see their general 
behaviour and to recognize the first intermediate state as AG and the second as GF. We 
stress that the GF and AG intermediates have very similar end-to-end length and fraction 
of native contacts (ps (for the whole chain), making them indistinguishable in simple, one- 
dimensional, free energy landscapes: indeed, they are lumped together in the broad minimum 
at L ~ 13 nm in Fig. [2l 

Both the waiting and intermediate states (as the whole unfolding process) are charac- 
terised by time lengths varying in a wide range of values for different applied forces and, 
because of stochasticity, for different trajectories. In Table [T] we reported the mean life times 
at various constant forces. The times tag and tgf are obtained by an average of the times 
occurring between the first and the second jump in the end-to-end length, which have been 
fixed using the respective threshold values L = 75A and L = 225A. These averages have 
been calculated only for those trajectories which exhibit the corresponding unfolding path- 
way, while Tag and tgf at force / = 28 pN and tgf at force / = 40 pN are not reported 
in the table because of vanishing frequencies of the corresponding trajectories, as shown in 
Table [Tll The mean waiting phase time r^^ is the average over the 100 trajectories of the 
time at which the end-to-end length becomes longer than the threshold value L = 75A. For 
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forces / = 98 and 122 pN it does not make sense to define a waiting phase life time, since 
the protein starts to unravel as soon as the external force is applied at t = 0. Finally, the 
unfolding mean time r„ is the average on all the trajectories of the unfolding time, i.e. the 
time at which the molecule reaches the unfolding length previously defined. 

The probability distributions of intermediate life times for / = 81 pN have been plotted in 
Fig. [7l where it can be seen that both distributions can be fitted to the negative exponential 
function P{ts) = ;^exp|^| (where s is AG or GF, tg is the intermediate life time of 
s, Tg = (tg) its average), and that AG has a longer life than GF. Being the unfolding 
time the sum of the waiting phase time and of the intermediate state time we can naively 
conclude that if the protein follows the GF pathway, it will reach the unfolded state earlier. 
For the same reason and since at / = 81 pN the dominant contribution to the unfolding 
time comes from tgf and tag we can argue that at this force the exponential function fits 
well the unfolding times distribution tocM^. Furthermore, at very high forces a lognormal 
distribution of unfolding times has been proposed^. FigJH] shows this behaviour at force 
/ = 150 pN and the corresponding fit to P{tu) = —r?. — t. — -r exp 

Looking at data in Table H] we can try to interpret the three different force ranges in Fig. 
131 In the highest force range there is neither an intermediate state, nor a waiting phase and 
the unfolding time corresponds mainly to the MC time needed for completing the unfolding 
where every MC move that unravels the molecule and thus increases the length is accepted 
and every move that reduces the length is refused, that is, an extremely biased random 
walk, corresponding to the scenario proposed in^. Lowering the force the contributions of 
Tgf and tag to the global unfolding time become important while the waiting phase, if it 
exists, is still quite short. Finally, in the lowest force interval, also the waiting phase gives 
its contribution and this matches with the larger slope of the fit line. 

Table Ull shows the frequencies of various unfolding pathways. Predictably, as the force 
increases, the trajectories without any intermediate state become dominant and we expect 
them to be the only escape route at even higher forces, as already observed in previous 
all-atom simulations^. 

At / = 28 pN, because of long life times and great fluctuations, all the trajectories are of 
mixed AG-GF type. Furthermore, at such a low force, the molecule can completely refold 
after it partially unravelled. This can happen many times before complete unfolding^i. 
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FIG. 7: Histograms of the intermediate life times for AG pathway (red hne) and GF 
pathway (green hne) at force / = 81 pN. Data obtained from 3600 different trajectories. 

The hnes are exponential fits. 
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FIG. 8: Histograms of the unfolding times at force / = 150 pN. Data obtained from 5000 
different trajectories and the bin size of the histogram is 1. The fit is to a lognormal 

distribution. 



B. Constant velocity 



We run MC simulations at six different puUing velocities (0.03, 0.05, 0.1, 0.3, 0.5, 1 
/xm/s) with a spring constant A; = 30 pN/nm and an initial length Cq = 32A. Once again, 
our conditions are much closer to experimental ones than most previous simulations. In 
constant velocity simulations, Vogel et al^ used v =50 m/s (with a spring constant of ~ 4 
nN/nm), Klimov and Thirumalai^ considered = 6 mm/s or faster, while experimental 
pulling speeds^"^ were 0.4 and 0.6 /im/s (with spring constants of 45-50 pN/nm) and in vivo 
pulling speeds are believed to be even smaller. Only the all-atom Monte Carlo simulations 
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TABLE I: Unfolding time (tu), waiting phase life time (r^s), AG intermediate life time 
(tag) and GF intermediate life time (tgf) at different constant forces. Values are in MC 
steps and are approximated averages on 100 different trajectories at each force. 





r 


u 




TAG 


Tgf 


28 pN 


2.2 • 


10^ 


8.7- 


10^ 








36 pN 


3.0 • 


10^ 


8.8 • 


10^ 


4.3 


10^ 




A r\ AT 

40 pN 


8.8 • 


10^ 


2.8 • 


10^ 


6.1 


10^ 




46 pN 


2.9 • 


10^ 


6.9 • 


10^ 


2.4 


10^ 


1.4-10'^ 


53 pN 


1.1 ■ 


10^ 


2.2 • 


10^ 


1.2 


10^ 


9.4- 10^ 


65 pN 


2.7- 


10^ 


3.6 • 


10^ 


3.8 


10^ 


2.9 • 10^ 


81 pN 


6.6 • 


10^ 


1.1 • 


102 


1.5 


10^ 


1.2 • 10^ 


98 pN 


1.9 • 


10^ 






4.1 


10^ 


7.4 • 102 


122 pN 


1.5 • 


102 






5.7 


102 


1.9 • 102 



by Mitternacht et al^ could probe constant pulling speeds in the same range as we are 
considering here (with a spring constant of 37 pN/nm). 

In FigJTO] we sketch the possible unfolding pathways scheme in the constant velocity 
case. Consistent with our constant force results and with previous simulations^i^i^, at each 
value of V most of the trajectories start with the detachment of strand G, giving rise to an 
intermediate corresponding to the shallow minimum around 6 nm in Fig. [2j In few runs 
strand A is the first to unravel, but then it refolds, with the consequent detachment of 
G. Then the unfolding continues through a phase in which strand A is gradually unzipped 
and when this unzipping is completed the molecule reaches the intermediate AG (end-to- 
end length ~ 13.5 nm). Again we found a mixed AG-GF behaviour: some trajectories 
do not stay in the AG intermediate till the complete unfolding but they may jump from 
AG to GF intermediate (end-to-end length ~ 14 nm) and back. Table IIIII reports the 
relative frequencies of various unfolding pathways. It is worth noting that, since statistical 
fluctuations are greater at low pulling rates, the number of mixed AG-GF trajectories and 
the number of trajectories in which strand A unravels before strand G grows as pulling 
velocity decreases. Typical trajectories are reported in Fig. [TTl in Fig. [121 and in^i. 
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TABLE II: Relative frequencies of unfolding pathways at constant force. 100 trajectories 

for each value of the force. 





AG 


GF 


no intermediates 


mixed AG-GF 


28 pN 











1 


36 pN 


0.07 








0.93 


40 pN 


0.96 








0.04 


46 pN 


0.93 


0.07 








53 pN 


0.67 


0.32 


0.01 





65 pN 


0.59 


0.31 


0.1 





81 pN 


0.41 


0.34 


0.25 





98 pN 


0.43 


0.23 


0.34 





122 pN 


0.2 


0.06 


0.74 






The average rupture force of the native state depends on the pulling rate and is reported 
in Table HVl At the puUing speed considered, the average rupture force we obtained for the 
native state ranges between 80 to 100 pN, which is in remarkable agreement with the AFM 
results. Fernandez and coworkers reported 75 pN when pulling at 0.6 /im/s^ and 100 pN at 
0.4 fim/^. Mitternacht et al^ reported values from 88 pN at 0.03 fim/s to 114 pN at 0.1 
/im/s. Notice that in our results the average rupture force increases with the pulling speed, 
as predicted by theories^"— and verified in experiments^. AFM results showed a different 
behaviour, and this was attributed to the interactions with the other modules building up 
the polyprotein which is actually pulled in such AFM experiments^. In the same work, 
the average unfolding force of the intermediate states was reported to be 50 pN. Pulling on 
suitable mutants, two kind of intermediates were inferred on the basis of experimental results, 
namely G and AB. In our model we did not observe intermediate AB, while intermediate G 
has an average rupture force between 40 and 50 pN. The other intermediates we observed, 
AG and GF, are more stable, with average unfolding forces around 70 pN. 

The distribution of the unfolding forces is well fitted by the theoretical result^ 



tor 



(9) 



rXutQ 

Such an equation corresponds to the rupture force probability distribution of a single molec 
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TABLE III: Relative frequencies of unfolding pathways. 100 trajectories for each value of 

the velocity. 





G 


A ^ G 




AG 


mixed AG-GF 


AG 


mixed AG-GF 


1 /im/s 


0.82 


0.15 


0.03 


0.00 


0.5 /im/s 


0.76 


0.20 


0.03 


0.01 


0.3 /iin/s 


0.49 


0.39 


0.09 


0.03 


0.1 fim/s 


0.11 


0.85 


0.01 


0.03 


0.05 fim/s 


0.08 


0.87 


0.01 


0.04 


0.03 iim/s 


0.07 


0.79 


0.00 


0.14 



ular bond subject to a force that increases linearly with a rate r— . In Fig. [9] we plot the 
unfolding force histogram at ^; = 0.5 fim/s, and fit the data to eq. (jH]), with a = r ■ and 
Xu as fitting parameters. The fit gives a;^ = 8.0 A, which is larger than the value found for 
the constant force set-up, but it must be kept in mind that the above theoretical result was 
derived for a force which is linear in time with a slope r, while here the force is associated 
to the harmonic potential which moves at constant velocity v. 

0.16 
0.14 
0.12 
0.1 
O.OS 
0.08 
0.04 
0.02 


40 60 80 100 120 140 

force f (pN) 

FIG. 9: Distributions of the rupture forces of the native state at pulling velocity 
V = 0.5 /im/s. Data obtained from 500 different trajectories; bin size of the histogram is 2. 

The fit is to Eq. El 
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FIG. 10: Unfolding pathways scheme of Fnlllio pulled at constant velocity. Intermediate 
states in the full square boxes have a rupture force remarkably higher than those in dashed 

boxes. 



TABLE IV: Average rupture forces. 



rupture 


N ^ G 


G ^ AG 


AG ^ U 


GF ^ U 


1 /xm/s 


98.5 ± 6.4 


40.8 ±2.6 


99.6 ±9.9 


77.3 ± 7.7 


0.5 fim/s 


96.1 ±6.1 


42.2 ±2.6 


96.5 ±7.3 


77.5 ±4.0 


0.3 /xm/s 


94.5 ± 6.9 


43.4 ±2.4 


92.4 ± 7.8 


76.5 ±5.8 


0.1 /xm/s 


89.0 ± 6.5 


45.4 ± 1.8 


86.5 ±7.9 


69.9 ±5.8 


0.05 iim/s 


87.8 ±5.1 


46.1 ± 1.6 


81.9 ±8.4 


67.6 ±5.9 


0.03 /.tm/s 


87.3 ± 5.8 


46.9 ±1.7 


81.5 ±9.7 


66.7 ±5.4 



V. CONCLUSIONS 

We have simulated constant force and constant pulling speed unfolding of Fnlllio, the 
tenth type III domain of fibronectin using an Ising-like model we have developed and val- 
idated in recent years, whose equilibrium thermodynamics is exactly solvable. Force and 
time units have been determined by comparison with existing estimates of the equilibrium 
unfolding force and the zero-force average unfolding time. We can probe force and speed 
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100000 200000 300000 400000 500000 600000 

Time { MC steps ) 

(a) Unfolding pathway: G — !■ AG 




460000 480000 500000 520000 540000 560000 580000 600000 620000 

Time { MC steps ) 

(b) Unfolding pathway: G — ?• mixed AG-GF (partial) 

FIG. 11: MC time evolution of the end-to-end length (red line) and of a few order 
parameters with a constant velocity of 1 /im/s. Green line: weighted fraction of native 
contacts, whole FnllliQ. Blue line: weighted fraction of native contacts between strands 
G and F. Purple line: weighted fraction of native contacts between strands A and B. Cyan 
line: weighted fraction of native contacts between strands C and F. 

ranges close to in vivo and experimental conditions, which was not possible in most previous 
simulations. 

At high enough constant force we observed two-state transitions only. At smaller forces 
and at all pulling speeds considered we observed several intermediates, denoted by A, G, AG 
and GF, based on the strands which are unfolded in each intermediate. Possible unfolding 
pathways are summarized in Fig. H] for the constant force protocol and in Fig. [10] for the 
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2e+06 4e+06 6e+06 8e+06 1e+07 1.2e+07 

Time ( MC steps ) 



FIG. 12: MC time evolution of end-to-end length (red line), force (yellow line) and of some 
order parameters with a constant velocity of 0.03 /im/s for a mixed AG-GF trajectory. 
Green line: weighted fraction of native contacts, whole FnllliQ. Blue line: weighted 
fraction of native contacts between strands G and F. Purple line: weighted fraction of 
native contacts between strands A and B. Cyan line: weighted fraction of native contacts 
between strands C and F. Bins of 2000 MC steps have been used to reduce fluctuations in 

the plot. 

constant pulling speed protocol. 

The unfolding pathways depend on the applied force or on the pulling speed, which was 
already observed in^. Such pathways become more complex at low forces and speeds, due 
to the increase in fluctuations. Previous simulations and experiments showed some discrep- 
ancies in the unfolding pathways, and our work is not going to resolve such discrepancies, 
but some general trends are confirmed. In particular, the most frequently observed inter- 
mediate in our trajectories was AG, which was observed in all previous simulations^"-. In 
addition, constant pulling speed trajectories always visit intermediate G, which was also ob- 
served in most previous simulations^!^""- and in AFM experiments^. On the other hand, we 
have never observed intermediate AB, which has been reported in many simulationa^"^ and 
experiments^. We have instead observed, at low enough forces and speeds, intermediates 
A and GF, which were previously reported only by Gao et al^ (A only) and Mitternacht 
et al^ (both A and GF). These intermediates have end-to-end lengths close to G and AG, 
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respectively, and cannot be distinguished in the usual one-dimensional free energy landscape 
using the end-to-end length as a reaction coordinate. Interestingly, in our trajectories we 
observe fluctuations between intermediates with similar lengths, that is between A and G 
or between AG and GF. Fluctuations between AG and GF, in particular, are observed in 
most trajectories at the lowest forces and pulling speeds we have considered, and therefore 
one could speculate that they have some biological signiflcance. 

/.From a more quantitative point of view, given the extreme simplicity of our model, it 
is remarkable that many quantities we can compute agree well with the results from AFM 
experiments or previous simulations with similar parameters. Our estimate for the native 
state unfolding length is Xu = 3.4±0.1 A, to be compared with Xu = 3.8 A from AFM results^ 
and with x„ = 4 A from the simulations by Mitternacht et aM. The average rupture force 
we obtained for the native state is in the range 80 to 100 pN, to be compared with results 
from 75 to 100 pN reported by AFM studies^i^, and from 88 to 114 pN in the simulations by 
Mitternacht et al^. Finally, our intermediate G has an average rupture force between 40 and 
50 pN, to be compared with 50 pN found in experiments^, though it must be mentioned that 
in such work the intermediate might be an average between the G and AB intermediates. 
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Appendix to "Pathways of mechanical unfolding of 
FnllliQi low force intermediates" 

In this appendix we report some additional results, complementary to those reported in 
the main paper. 

VI. EQUILIBRIUM PROPERTIES 

As mentioned in the main text, the equilibrium thermodynamics of the present model 
can be solved exactly. Fig. lA. 13] shows that a sharp transition in the fraction of native bonds 
and in the end-to-end length of the molecule occurs upon increasing the applied external 
force at about 20 pN. 



1 — ^ , , , , 1 300 




10 15 20 25 30 35 40 45 
force t (pN) 



FIG. A. 13: Average fraction of native bonds (red line) and end-to-end length (green line) 
as a function of the external force. The temperature has been fixed to 288 K. 

VII. UNFOLDING PATHWAYS 
A. Force Clamp 

In fig. lA. 141 two typical unfolding trajectories at constant force / = 65 pN are plotted. 

In fig. lA.lSl a typical unfolding trajectory at low force / = 28 pN is plotted: the molecule 
hops back and forth between the folded and the partially unfolded state, before a complete 
unfolding event takes place. 
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(a) Unfolding pathway: AG 



300 
275 
250 
225 
200 



-^H 25 
30000 



4000 6000 8000 



10000 12000 
Time (IVlc steps) 



326 
^ 300 
275 

- 250 

- 225 

- 200 

- 175 

- 150 

- 125 



14000 16000 18000 



(b) Unfolding pathway: no intermediates 



FIG. A. 14: Typical MC trajectories: end-to-end length (red line) and a few order 
parameters as functions of time, with a / = 65 pN. Green line: fraction of native contacts, 
whole FnllliQ. Blue line: fraction of native contacts between strands G and F. Purple 
line: fraction of native contacts between strands A and B. Cyan line: fraction of native 

contacts between strands C and F. 

B. Constant velocity 



In fig. (]A.16p . we plot an unfolding trajectory A — G — AG, for the constant velocity 
set-up. 
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FIG. A. 15: Typical MC trajectory exhibiting sequential partial unfolding-refolding at low 
force / = 28 pN: end-to-end length (red line) and fraction of native contacts for whole 

FnllliQ (green line). 




100000 150000 200000 250000 300000 350000 

Time ( MC steps ) 



(a) Unfolding pathway: A G — >■ AG (partial) 

FIG. A. 16: MC time evolution of the end-to-end length (red line) and of a few order 
parameters with a constant velocity of 1 fim/s. Green line: weighted fraction of native 
contacts, whole FnllliQ. Blue line: weighted fraction of native contacts between strands 
G and F. Purple line: weighted fraction of native contacts between strands A and B. Cyan 
line: weighted fraction of native contacts between strands C and F. 
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